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Abstraot 


Ve  consider  s  class  of  iterative  algorithms  for  solving  systems  of 
linear  equations  where  the  coefficient  matrix  is  nonsymmetric  with 
positive-definite  symmetric  part.  The  algorithms  are  modelled  after  the 
conjugate  gradient  method,  and  are  well-suited  for  large  sparse  systems. 
They  do  not  make  use  of  any  associated  symmetric  problems.  Convergence 
results  and  error  bounds  are  presented. 
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1.  Introduction 

The  conjugate  gradient  aethod  (CG),  first  described  by  Hestenes  and 
Stiefel  [8],  is  widely  nsed  for  approximating  the  solutions  of  large  sparse 
systems  of  linear  equations 
A  x  =  f 

where  A  is  an  N  x  N,  real,  symmetric,  positive-definite  matrix  [1,  3,  5, 

12].  CG  can  be  viewed  as  a  direct  method  that,  in  the  absence  of  round-off 
error,  gives  the  exact  solution  in  at  most  N  steps;  or  as  an  iterative 
procedure  that  gives  a  good  approximation  to  the  solution  in  far  fewer 
steps  (see  [13]).  A  feature  of  the  method  that  makes  it  particularly 
suitable  for  large  sparse  systems  is  that  all  references  to  A  are  in  the 
form  of  a  matrix-vector  product  Av,  so  that  the  storage  requirements  are 
usually  lower  than  for  direct  methods.  Another  attractive  feature  is  that, 
unlike  most  iterative  methods,  CG  does  not  require  any  estimation  of 
parameters.  In  this  paper,  we  discuss  a  class  of  conjugate-gradient-like 
descent  methods  that  can  be  used  to  solve  nonsymmetric  systems  of  linear 
equations.  Numerical  experiments  with  ^hese  methods  are  described  in  [6, 
71. 


A  common  technique  [8]  for  solving  nonsyanetric  problems  is  to  apply 
the  conjugate  gradient  method  to  the  normal  equations 
ATA  x  -  ATf  , 

in  which  the  coefficient  matrix  is  symmetric  and  positive-definite.  On  the 

i’th  iteration,  CG  computes  an  approximate  solution  that  is  in  some  sense 

optimal  in  a  Krylov  subspace  of  the  form  {v,A1Av, . . . , (AaA)  v] .  This 
T 

dependence  on  A*A  tends  to  make  the  convergence  of  CG  slow  (see  [1],  [3]). 


Page  3 


Recently,  Concns  end  Golnb  [4]  end  Widlund  [18]  devised  e  generalized 
conjugate  gradient  algorithm  (GCG)  for  nonsymmetric  systems  in  which  the 
coefficient  aatrix  has  positive-definite  symmetric  part.  Like  the 
conjugate  gradient  method,  GCG  gives  the  exact  solution  in  at  most  N 
iterations.  However,  on  each  iteration  it  requires  the  solution  of  an 
auxiliary  system  of  equations  in  which  the  coefficient  matrix  is  the 
symmetric  part  of  A.  Also,  if  the  nonsymmetric  part  is  relatively  large, 
then  convergence  may  be  slow. 

The  methods  we  present  depend  on  a  Krylov  sequence  based  on  A  rather 
T 

than  A  A,  and  they  do  not  require  the  solution  of  any  auxiliary  systems. 
They  do  require  that  the  symmetric  part  of  A  be  positive-definite.  In 
Section  2,  we  present  four  variants  that  differ  in  their  work  and  storage 
requirements.  In  Sections  3  and  4,  we  present  convergence  results  and 
error  bounds  for  each  of  the  four  variants.  In  Section  5,  we  discuss 
several  alternative  formulations. 

A+A^ 

The  symmetric  part  of  the  coefficient  matrix  A  is  given  by  M  :*  — , 

and  the  skew- symmetric  part  by  R  :* - ^ — .  Thus  A  =  M  -  R.  The  Jordan 

canonical  form  of  A  is  denoted  by  J  :*  T  *A  T. 

For  any  square  matrix  X,  let  ^B£n(X)  denote  the  eigenvalue  of  X  of 
smallest  absolute  value,  and  let  ^Baz(X)  denote  the  eigenvalue  of  largest 
absolute  value.  The  spectral  radius  l^Baz(X)i  of  X  is  denoted  by  p(X). 
The  set  of  eigenvalues  of  X,  also  called  the  spectrum  of  X,  is  denoted  by 
<r(X).  If  X  is  nonsingular,  then  the  condition  number  of  X,  K(X) ,  is 
defined  as  Ilx|l2  ||X_1ll2. 
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Finally,  given  a  aet  of  vectors  {p0,...,pk).  let  <pQ,...,pk>  denote 
the  space  spanned  by  (pQ,...,pk). 

2.  Desfiejit  Mc.tkfiiU  &£  Nonsv— letric  Sts teas 

In  this  section,  ve  present  a  class  of  descent  Methods  for  solving  the 
systea  of  linear  equations 
(2.1)  A  x  -  f 

where  A  is  a  nonsyaaetric  aatrix  of  order  N  with  positive-definite 
syaaetric  part.  Ve  consider  four  variants,  all  of  which  have  the  following 
general  fora: 

(2.2a)  Choose  xn 

e 

Convergence  DO 


1 1 ri+i 1 1 2  *  I lf-A(x1+ap1) 1 12  as  a 
function  of  a,  so  that  the  Euclidean  nora  of  the  residual  decreases  at  each 
step.  The  variants  differ  in  the  technique  used  to  coapute  the  new 

i+1* 


(2.2b) 

(2.2c) 

(2 .2d) 
(2.2e) 
(2.2f) 

(2.2g) 


Coapute  rQ  “  f  -  AxQ 
Set  P0  -  rQ  . 

FOR  i  *  0  STEP  1  UNTIL 


(rl.Api) 

"  (Apj.Apj) 

*i+l  "  xi  +  Vi 

ri+l  "  ri  -  *iApi 
Coapute  pi+1 


The  choice  of  a^  in  (2. 2d)  ainiaises 


direction  veotor  p 


5 


A  good  choice  for  p^+2  is  one  that  results  in  a  significant  decrease 
in  the  non  of  the  residual  llr^^ll^  but  does  not  require  a  large  amount 
of  work  to  compute.  When  A  is  symmetric  and  positive-def inite,  such  a 


vector  can  be  computed  by  the  simple  recurrence  relation 
(2.3a) 
where 


pi+l  "  ri+l 


+  biPi 


(2.3b) 


<Ari+l'Api) 

(Ap^Apj) 


The  method  defined  by  (2.2)  and  (2.3)  is  equivalent  to  a  variant  of  CG 

known  as  the  conjugate  residual  method  (CR)  [16] .  The  direction  vectors 
T 

produced  are  A  A-orthogonal,  that  is 
(2.4)  (Ap^.Apj)  =  0  ,  for  i  ^  j  , 

end  minimizes  the  functional 

E(w)  :=  I  If  -  Awl l2 

over  the  affine  space  xQ  +  <pQ,....pi>. 


If  A  is  nonsymmetric  and  the  algorithm  defined  by  (2.2)  and  (2.3)  is 

applied  to  (2.1),  then  the  orthogonality  relation  (2.4)  does  not  hold  in 

T 

general.  However,  a  set  of  A  A-orthogonal  directions  can  be  generated  by 

using  all  the  previous  vectors  Cp j } to  compute  Pi+j! 

i 

(2.5.)  »U1  -  rltl  *  £  bjl>Pj  . 

where 


(2.5b) 


(Ar^j.Ap  ) 

(Ap-VAP|) 


j  i  i 


The  iterate  x^+2  generated  by  (2.2)  and  (2.5)  minimizes  E(w)  over 
x0  +  <Po»***»Pi>  (see  Theorem  3.1).  We  refer  to  this  algorithm  as  the 


generalized  conjugate  residual  method  (OCR).  In  the  absence  of  roundoff 
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error,  OCR  gives  the  exact  solution  to  (2.1)  i&  at  most  N  iterations  (see 
Corollary  3 .2) . 


The  work  and  storage  requirements  per  iteration  of  GCR  may  be 
prohibitively  high  when  N  is  large.  Vinsome  [17]  has  proposed  a  method 
called  Orthomin  that  can  be  viewed  as  a  modification  of  GCR  that  is 

significantly  less  expensive  per  iteration.  Instead  of  making  p^+1 

T  i 

A  A-orthogonal  to  all  the  preceding  direction  vectors  Cp^ ) »  °oe  can  make 

Pi+1  orthogonal  to  only  the  last  k  (£.  0)  vectors  *Pj^j=i-k+l: 

i 


( 2.6 ) 


Pi+1  =  ri+l  + 


h  ( i) 
bj  pj 


j=i-k+l 

with  defined  as  in  (2.5b)  .*  Only  k  direction  vectors  need  be 
saved.  We  refer  to  this  method  as  Orthomin(k)  (see  [19]).  Both  GCR  and 
Orthomin(k)  for  k  2  1  are  equivalent  to  the  conjugate  residual  method  when 
A  is  symmetric  and  positive-definite. 


Another  alternative  is  to  restart  GCR  periodically:  every  k+1 

iterations,  the  current  iterate  xj(%+i)  taken  as  the  new  starting 

2 

guess.  At  most  k  direction  vectors  have  to  be  saved,  so  that  the  storage 
costs  are  the  same  as  for  Orthomin(k) .  However,  the  cost  per  iteration  is 
lower,  since  in  general  fewer  than  k  direction  vectors  are  used  to  compute 
pi+l*  refer  to  this  restarted  method  as  GCR(k). 


1The  first  k  directions  *re  computed  by  (2.5a),  as  in  GCR. 

Here  j  is  a  eou&ter  for  the  number  of  restarts.  The  j  cycle  of  GCR(k) 
produoes  the  sequence  of  approximate  solutions  (k+l)+i* 
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For  the  special  case  k  =  0,  Orthomin(k)  and  GCR(k)  are  identical,  with 
(2.7)  Pi+1  =  ri+1 

This  method,  which  we  refer  to  as  the  minimum  residnal  method  (NR),  has 
very  modest  work  and  storage  requirements,  and  in  the  symmetric  case 
resembles  the  method  of  steepest  descent  (see  [10]).  Because  of  its 
simplicity,  we  consider  it  separately  from  Orthomin(k)  and  GCR(k) . 

In  Table  I,  we  summarize  the  work  and  storage  costs  (excluding  storage 
for  A)  of  computing  x^  for  each  of  the  methods.  The  entries  in  this  table 
are  determined  as  follows.  For  GCR,  the  storage  cost  includes  space  for 
the  vectors  x^,  r^,  Ar^,  Pq , . . . *Pj_j  *  *ud  Ap^, . . . ,Apj_j .  The 
work/ iteration  includes  two  inner— products  for  two  scalar-vector 

products  for  x^  and  r^,  i  inner-products  for  (b^*  1  scalar-vector 

products  for  p^  i  scalar-vector  products  to  compute  kp^  by 

i-1 

Ap.  =  Ar.  +  l  b<i_1)Ap.  . 

1  1  j=0  3  3 

and  one  matrix-vector  product  Atj.  The  total  is  thus  (3i+4)N  +  1  mv.  The 
entries  for  Orthomin(k)  correspond  to  the  requirements  after  the  k**1 
iteration,  and  are  the  same  as  those  for  the  kth  iteration  of  GCR.  The 
work  given  for  GCR(k)  is  the  average  over  k+1  iterations.  The  cost  of  MR 
is  the  same  as  the  cost  of  Orthomin(O)  or  GCR(O) . 


a 

Several  other  implementations  are  possible.  In  Orthomin(k)  or  GCX(k),  it 
may  be  cheaper  to  eompute  Ap^  by  a  matrix-vector  product  for  large  k.  With 

a  third  matrix-vector  product,  bj*)  can  be  computed  as 

-(ATAr1+i.Pj  )/(Apj,Apj),  and  the  previous  (Ap j }  need  not  be  saved. 


OCR 


Orthoain(k)  I  GCR(k) 


+ 


Work /  1 

(3i+4)N  I 

(3k+4) N 

1  ( (3/2)k+4)N 

1  4N 

Iterationl 

+  1  av  1 

+  1  av 

1  +  1  av 

i  +  1  av 

Storage  1 

(2i+3)N  I 

(2k+3)N 

1  (2k+3)N 

1  3N 

I 

I 

+ 

I 

+ 


Table  X  Work  per  iteration  (mv  denotes  a  aatrix-veetor 
product)  and  storage  requireaents. 


3.  Convergence  of  OCR  tad  GCR(k) 

In  this  section,  we  show  that  GCR  gives  the  exact  solution  in  at  aost 
N  iterations  and  present  error  bounds  for  GCR  and  GCR(k) .  We  first 
establish  a  set  of  relations  aaong  the  vectors  generated  by  GCR.  (See  [8] 
for  an  analogous  result  for  the  conjugate  gradient  aethod.) 

Theorea  3 .1 .  If  (x^},  (r^J,  and  (p^}  are  the  iterates  generated  by  GCR  in 


solving 

the  linear  systea  (2.1),  then  the  following 

relations  hold: 

(3.1a) 

*  0  p 

i  ^  J  ; 

(3.1b) 

(rj.Apj)  **  0  , 

i  >  j  ; 

(3.1c) 

(rj.Apj)  «■  (rj.Arj) 

p 

(3. Id) 

(^.Arj)  -  0  , 

i  >  j  ; 

(3.1e) 

*ri*Apj)  *  <r0,Apj) 

»  i  1  j  » 

(3  .If) 

<?()»•••»?£>  “  <Pg»ApQ» 

• • . »A  pQ>  *  <rQ, . . 

.,ri>  ; 

(3.1g) 

if  rA  jl  0,  then  p^  ^  0 

9 

(3.1h) 

ainiaizes  E(w)  ■ 

1 1 f-Aw 1 1 j  over  the 

affine  space 

x0  +  • 


Proof.  The  directions  (p  }  are  ohosen  so  that  (3.1a)  holds. 
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Relation  (3.1b)  is  proved  by  induction  on  i.  It  is  vacuously  true  for 
i  *  0.  Assuae  that  it  holds  for  i  <.  t.  Then,  using  (2.2f)  and  taking  the 
inner  product  with  Apj , 

(3.2)  ^t+l#^*j^  ~  ~  (Apt,Apj )  • 

If  j  <  t,  then  the  terns  on  the  right-hand  side  are  zero  by  the  induction 
hypothesis  and  (3.1a).  If  j  -  t,  then  the  right-hand  side  is  zero  by  the 
definition  of  a^.  Hence  (3.1b)  holds  for  i  -  t+1. 


For  (3.1c),  by  prenultiplying  (2.5a)  by  A  and  taking  the  inner  product 


with  ri# 


(r^Apj)  =  (rj.Arj)  +  2  bj1  ^  (r^.Apj) 

B  Ar ^ )  , 

since  all  the  terms  in  the  sum  are  zero  by  (3.1b). 


To  prove  (3. Id),  we  rewrite  (2.5a)  as 

j-1 

r  =  p  -  >  p 

J  PJ  tZ0  *  * 


Premultiplying  by  A  and  taking  the  inner  product  with  r^  (i  >  j), 

j-1 

(r^Arj)  =  (tj.Apj)  -  b[j_1)(r.,Apt) 


0  , 


by  (3.1b) 


Relation  (3.1e)  is  proved  by  induction  on  i,  for  i  i.  j.  It  is 
trivially  true  when  i  ■  0.  Assume  that  it  holds  for  i  ■  t  <  j.  Using 
(3.2), 

*rt+l'Apj>  “  <*t*APj)  “  at  (APt'Apj* 
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B  <VAPj)  * 

by  the  induction  hypothesis  and  (3.1a). 


lelation  (3. If)  is  proved  by  induction  on  i.  The  three  spaces 

identical  when  i  =  0.  As suae  that  they  are  identical  for  i  £  t. 

(p j } j ,p  C (r^i • • . *  But  by  (2.5a)/ 

t 


■t+1 


'*«  *  jSo  *“’'i 


are 

Then 


so  that  <Pq, . . . »Pt+j>  i*  a  subspace  of  <rg, . . . »rt+j> .  By  (3.1a) ,  the 
vectors  tPjJj^Q  are  linearly  independent.  Bence,  the  diaension  of 
is  greater  than  or  equal  to  t+1,  which  implies  that  {rj)j*J 
are  linearly  independent  and  <P0, . . . ,pt+1>  ■  <rQ, . . . ,rt+1> .  Similarly,  by 
(2.2f) , 

t 

■w  - -  't*p,  *  ^  »]*'  pj  • 

By  the  induction  hypothesis,  rt>  Apt,  and  (Pjj]»o  8  <P0»Ap0, . . . ,At+1P0> ,  so 
that  <Pq, . . . »Pt+i>  i*  s  subspace  of  <pQ,ApQ, . . . ,At+1p0> .  Again,  the  two 
spaces  are  equal  because  the  {p^}  are  linearly  independent. 


The  proof  of  (3.1g)  depends  on  the  fact  that  the  symmetric  part  M  of  A 
is  positive-definite.  If  Tj  +  0,  then  by  (3.1c), 

^i.Apj)  *  (r^Ar^  =  >  0  , 

so  that  (r^/Ap^)  t  0,  whence  p^,  #  0. 


For  the  proof  of  (3.1h),  note  that 

i 

xi+l  "  x0  +  \j*j  • 

7.  T 

Thus,  B(zi+1>  is  a  quadratic  functional  in  “  (a(),...,a1)  •  Indeed, 

using  (3.1a)  to  simplify  the  quadratic  tens. 


* 
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E(xi+1)2  -  llr0  -  a.  APj||2 


<r0'r0)  '  2  A  V  VApj>  +  A  »j(Apj'Apj)  • 


j“0 


j“0 


Urns,  E(w)  is  minimized  over  x^  +  <P<j»...»Pi>  when 

(rQ,Ap  )  (r  ,Ap  ) 

•  a  i  .  J — ~  b  ■  .  -  J 

“  «  /a  a 


j  (Apj  »APj )  (Apj.Ap^) 


by  (3 .le) . 


Q.E.D. 


Corollary  3 .2 .  GCR  gives  the  exact  solution  to  (2.1)  in  at  most  N 
iterations. 


Proof .  If  »  0  for  some  i  ±  N-l ,  then  Ax^  =  f  and  the  assertion  is 


proved.  If  r^  t  0  for  all  i  i  N-l,  then  t  0  for  all  i  N-l  by  (3.1g) 

I  N  1  _  «  I  .  _  I  .  .  »  _  .  ,  J  . 

?o  * 

.N 


By  (3.1a),  fPjA^o  ar®  ly  independent,  so  that  <p^, . . .  *PN_1>  *  R". 


Hence,  by  (3.1h),  x^  minimizes  the  functional  E  over  R  ,  i.e.,  x^  is  the 
solution  to  the  systea. 

Q.E.D. 


This  result  does  not  give  any  insight  into  how  close  x^  is  to  the 
solution  of  (2.1)  for  i  <  N.  We  now  derive  an  error  bound  for  GCR  that 
proves  that  OCR  converges  as  an  iterative  method.  Let  denote  the  set  of 
real  polynomials  of  degree  less  than  or  equal  to  i  such  that  q^(0)  =  1. 

Theorem  3 .3 .  If  (r^}  is  the  sequence  of  residuals  generated  by  OCR,  then 


(3.3)  llrjlji  .in  1 1,^*)  1 12  I  lr„l  l2  <  [l  -  I  lr„l  I 


"i  *  P1 
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Henoe,  OCR  converges.  If  A  hss  s  complete  set  of  eigenvectors,  then 

(3.4)  llrjlj  I(T)  Mi  ,,r0,,2  * 

where 


IT  :«  min  nsx  lq.(X)l 

qj  s  PA  X  s  o(A) 


Moreover,  if  A  is  normal,  then 
(3.5)  ,,ri,,2  1  Mi  ,,r0,,2 


Proof.  By  (3. If),  the  residuals  {r^}  genersted  by  GCR  are  of  the 
Tj  =  q1(A)r()  for  some  qt  8  Pj.  By  (3.1h), 

(3.6)  1 1 r i  1 1  j  -  min  ||qi(A)r0ll2 

Hi  «  Pi 

The  first  inequality  of  (3.3)  is  an  immediate  consequence  of  (3.6). 
prove  the  second  inequality  of  (3.3),  note  that  for  q^z)  -  1  +  or  e  P 

min  llqi<A)|l2  i  llq^A)1!^  £  llq^A)!!*  . 

4i,Pi 


But 


Moreover , 


llq1(A)ll^ 


max 

x*0 


( ( I+aA) x , ( I+aA) x) 
(x,x) 


■  max 

x+Q 


[‘ 


+  2a 


(x.Ax) 

(x,x) 


+  a 


(Ax, Ax) 1 
(x,x)"J  * 


(Ax, Ax) 
(x,x) 


(x,ATAx) 

(x,x) 


i  a.„uta> 


and,  using  the  positive-definiteness  of  M, 


(x,Ax) 

(x,x) 


(x,Mx) 

(x,x) 


1  X-ln(M)  >  0 


form 


To 

1* 


Hence,  if  a  <  0 
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Since  the  sysoaetric  part  of  A  is  positive-definite,  the  spectrnai  of  A 

lies  in  the  open  right  half  of  the  coaplex  plane  (see  [9]).  Thus,  the 

analysis  of  Manteuffel  [11]  shows  that  Bln  1 1  Q.^( A)  1 1 2  *i  approach 

8  Pi 

sero  as  i  goes  to  infinity,  which  also  inplies  that  OCX  converges. 
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Theorea  3.3  can  also  be  used  to  establish  an  error  bound  for  GCR(k) . 
Corollary  3 .4.  If  {r^}  is  the  sequence  of  residuals  generated  by  GCR(k) , 
then 

(3.7)  ^rj(k+l)^2  ^  [  *ln  ^4k+l*A*^2p  ^r0^2 

Vl  8  Pk+1 


so  that 

(3.8) 


X  ,  00 

-Sia-T-]  7  H'0H2 

X  (A  A) J  0  2 

max 


Hence,  GCK(k)  converges.  Moreover,  if  A  has  a  complete  set  of 

eigenvectors,  then 

(3.9)  l,rj(k+l)M2  1  (I(T)  *Wj  1 1  r0 1 1 2  * 

and  if  A  is  nornal,  then 

(3.10)  ^rj(k+l)  ^2  ^  ***k+l^  ^*0^2 

Proof.  Assertions  (3.7),  (3.9),  and  (3.10)  follow  from  Theorem  3.3.  To 

prove  (3.8),  let  i  ■  jk  +  t  where  0  i  t  <  k.  Then 


r  *  i  00“ 

^'jk+t1  *2  ^  I1  _  ‘  TTt“]  * 1  rjk*  *2 


X  (ATA) 

max 


by  (3.3),  and 


X _.(M) 


max'A  A/ 


by  (3.7)  and  the  second  inequality  of  (3.3). 


Q.E.D. 
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4.  Convergence  o£  OgthWllB 

In  this  section,  we  present  convergence  results  for  Orthoain(k)  end  an 
alternative  error  bound  for  GCR  and  GCR(k).  Ve  also  present  an  analysis  of 
Orth on in  in  the  special  case  when  the  syaaetric  part  of  A  is  the  identity. 

Hie  vectors  generated  by  Orthoain(k)  satisfy  a  set  of  relations 
analogous  to  (3.1): 

Theorea  4 .1 .  The  iterates  (z^} ,  (r^),  and  {p^}  generated  by  Orthonin(k) 
satisfy  the  relations: 

(4.1a)  (Api,Apj)  **  0,  j  -  i-k,...,i-l  ,  i^k  ; 

(4.1b)  (r^Apj)  -  0,  j  -  i-k-1 , . . . ,  i-1  ,  i  2  k+1  ; 

(4.1c)  (ri,Api)  =  (t^.Atj)  ; 

(4. Id)  <ri.Ari_1)  -  0  ; 

(4.1e)  (r^,Apj)  -  (r^_k,Apj)  ,  i  -  j-k,...,j-l,  j  2k  * 

(4. If)  if  rA  *  0,  then  vi  *  0  ; 

(4.1g)  for  i  2  k,  ainiaizes  E(w)  over  the  affine  space 

*i-k  +  <pi-k',*,,pi> 

Corollary  3.4  with  k  =  0  implies  that  Orthoain(O)  (NR)  converges.  Ve 
now  prove  that  Orthoain(k)  converges  for  k  >  0.  Since  the  analysis  applies 
as  well  to  OCR,  OCR(k),  and  MR,  we  state  the  results  in  terns  of  all  four 
aethods.  Recalling  that  R  is  the  skew- syane trie  part  of  A,  we  first  prove 
two  prel ininary  results: 

Leaaa  4.2.  The  direction  vectors  (pi)  and  the  residuals  (r^)  generated  by 
OCR,  Orthoain(k),  GCR(k),  and  NR  satisfy 

(Apj,Apj)  1  (AtjiAfj)  . 


(4.2) 
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Proof.  The  direction  vectors  are  given  by 
P4  ■  ri  +  I  bj*  1>pj 

where  the  liaits  of  the  sua  are  defined  as  in  (2.5)  for  OCR  and  GCR(k),  and 


(2.6)  for  Orthoain(k).  Therefore,  by  the  A  A-orthogonality  of  the  {p^}  and 


the  definition  of  b 


(i-1) 


(Api>Api)  -  (Ari,Ari)  +  2  £  b< i-1) (Ari,Apj)  +  £  (b j i-1) )2(APj ,APj) 


(Ar. ,Ap  )' 

,AVV  '  i  cSJ^T 


i  (Arj.Arj) 


Q.E.D. 


4.3.  For  any  real  z  +  0, 


(4.3) 


(x,Az) 


W> 


(ix.Ax)  lMill(ll)l,tI<«)  ♦  P(«)2 


Proof.  Letting  y  »  Az, 


(z,Az)  _  (y,A  y) 


(Az, Az)  (y,y 


-1  -T 

.  AJ+A.  »  .  — 

'i*  »  y)  *  *+a  ” 

(y»y)  z  a  in'  2  ' 


a-i+a-t 

Thus,  it  suffices  to  bound  lBia( - 5 - Consider  the  identity 

(4.4)  X-1+Y_1  -  lYU+Y)-1!]-1 

which  holds  for  any  nonsingular  matrices  X  and  Y,  provided  that  X+Y  is 

T 

nonsingular.  Vith  X  ■  2A  and  Y  *  2A  ,  (4.4)  leads  to 


-1  -T 

-  C(2A)T(4M)"1(2A)]"1  -  t  (M  -  RX)  M_1  (M  -  R)]"1 
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-  (M  +  a1*-1*)-1 


For  uj  i  Mi 


(x, (M  +  RTM-1R)x)  *  (x,Mx)  +  (Rx,M_1Rx)  >  0 

—1  — T 

•o  that  N  +  RTM-1R  is  positive-def inite .  Therefore  -  is 

positive-definite  end 


X  (£ b±l) _ 1_ _ 

"ln  2  X  (M  +  RTM"1R> 


■  EX 


Bat 


X>ax(M  +  RTK-1R) 


max 

x*0 


[ 


(x.Mx) 

TxtxT- 


+  U.xV^)] 

+  (x,x)  J 


*  X..x<*> 


max 

x^O.RxjW) 


(Rx.M^Rx)  (Rx.Rx) 
(Ex.Rxl  (x,x) 


<  +  X _ (M-1)  I  |RTRl  I 


max 


“  ^„(M)  ♦  p(R)2/XBijl(M) 


Hence 


)  1 


_ 1 _ 

WM>  +  P(»>2/Xllill{M) 


Q.E.D. 


The  following  resnlt  proves  that  Orthomin(k)  converges  and  provides 
another  error  bonnd  for  OCX,  OCX(k),  and  MR. 

Theorw  4,4.  If  (r^J  is  the  sequence  of  residuals  generated  by  OCR, 
Orthoain(k) ,  OCX(k),  or  NR,  then 
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(4.5a) 


max 


(4.5b)  IUJIj  £  [l  - 


_ W"> _ 

WM)  W°  +  P(R) 


]in 


Proof.  By  (2.2f) , 


Therefore. 


I*i+1 1  ^  m  ^ ri*  ri^  ”  2ai^  ri*Api^  +  a£^Pj»^>j) 

_  .,2  ,  <Ti-k’l>2 

ri  2  "  2  TXpi,Api)  Up^Ap^ 

a  a  (til 1 ) 

"  ,,Xill2  ~  Up^ApJ  * 


1 1  *  i+i  1 1 2  <*i»Apt)  Uj.Apj) 

Hr. ||2  1  fVri>  tApi'Api^ 

X  st 


<  1  - 


by  (3.1o)/(4.1o)  and  (4.2).  Bat 


(r^.Ar^)  (rj.Arj) 


ri'ri}  Uri*Ari> 


(r. ,Ar j) 

Tr~^T  *  xmin(M)  ' 


(rj.Arj)  ^ri,ri^  (r^.Ar^)  ^jja^ 

IET5?  "  (r^ArJ  7ii'ri)  *  kw^k) 


1  (Mi 


so  that 


Pate  19 


which  prove*  (4.5a).  By  (4.3). 

<*i^fi> _ W™ _ 

(Arj.Arj)  Z  X-la(«)Xte(M)  +  p(R)2 


so  that 


I  l'i+l"2  L  t1  " 


W> 


W)l..,(,l)  *  ?<•> 


- - -J'2  "'iH: 


which  prove*  (4.5b). 


Q.E.D. 


In  general,  the  two  error  bounds  given  in  Theorem  4.4  are  not 

comparable.  They  are  equal  when  K  *>  I  and  (4.5b)  is  stronger  when  R  =  0. 

>A  X  ,  (A) 

When  R  -  0,  the  constant  ^ - - ^  in  (4.5b)  resembles  the 

X  (A)-X  .  'A) 

constant  |  .  .  ■  m  1  '  in  the  error  bound  for  the  steepest  descent 

L  ■ar'A^+*’«in'A^  J 

■ethod  (see  [10]).  Thus,  we  believe  that  the  bounds  in  Theorem  4.4  are  not 
strict  for  k  i  1. 

If  A  ■  I  -  R  with  R  skew- symmetric,  then  Orthomin(l)  is  equivalent  to 
OCR,  and  we  can  improve  the  error  bounds  of  Theorem  3.3  and  Theorem  4.4. 


Theorem  4.5.  If  A  -  I  -  R  with  R  skew-symmetric,  then  Orthoain(l)  is 
equivalent  to  OCR.  The  residuals  (r^)  generated  by  Orthoain(l)  satisfy 


(4.0 


for  even  t 


Page  20 


Proof.  To  prove  that  Orthoain(l)  is  equivalent  to  GCS,  it  suffices  to  show 
that  b j  =  0  in  (2.5b)  for  j  <.  i-1.  But  the  nnaerator  is 

(Ar j.+i ,Apj )  n  "  ^i+l'^j^  * 

By  (3.1b). 

(ri+l'A^j)  =  "  ^ri+l'APj^  =  ®  • 

Hence,  by  the  skew- syaae try  of  R, 

2 

(Ar i+i < Apj )  «  -  (ri+^,Apj)  +  (ri+1,RApj)  =  -  (*i+1.A  Pj) 

But  by  (2.2f) , 

(r1+i,A2pj)  =  —  (ri+i»A(rj-rj+1))  -  0  , 

J 

for  j  1  i-1,  by  (3. Id). 


For  (4.6),  observe  that  A  *  I  -  R  is  a  normal  matrix,  so  that  (3.5) 
holds.  Ve  prove  (4.6)  by  bounding  M^.  Widlnnd  [18]  has  shown  that 

(4.7)  i  [cosh(t  Cl  +  ^+p(R)2) ) )  1  2  » 

for  even  t.  Let  q  -  +  ^.+p(R)2) .  Using 

cosh( a)  *  j  (#  +  e  )  , 


(4.7)  reduces  to 


_t  .  -t 
q  +  q 


q2t  +  i 


from  which  (4.6)  follows. 


Q.E.D. 
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s.  Qtfa&r  AamauaJm 


In  this  section,  we  discuss  several  methods  that  are  mathematically 
equivalent  to  GCR. 


Ve  derived  GCR  from  CR  by  replacing  the  short  recurrence  for  direction 

T 

vectors  (2.3)  with  (2.5),  which  produces  a  set  of  A  A-orthogonal  vectors 

when  A  is  nonsymmetric.  Young  and  Jea  [19]  present  an  alternative, 

T 

Lane z os- like  method  for  counting  A  A-orthogonal  direction  vectors: 

i 

(5.1a)  p'  .  -  Ap'  +  I  b(i)p'  , 

j-0  J  J 

where 


(5. lb) 


(i)  (A2p',Ap») 

J  ""(Ap'.Ap*) 


j  i  i 


If  (pj)  is  the  set  of  direction  vectors  generated  by  GCR  and  p^  *  pQ,  then 
p'  =  Cjpj  for  some  scalar  Cj  (see  [191).  Hence,  this  procedure  can  be  need 
to  compute  directions  in  place  of  (2.5).  The  resulting  algorithm  is 
equivalent  to  GCR,  but  does  not  require  the  symmetric  part  of  A  to  be 
positive-definite . 


Azelsson  [2]  takes  a  somewhat  different  approach.  Let  x^,  rQ,  and  pQ 

be  as  in  (2.2).  Then  one  iteration  of  Axelsson's  method  is  given  by: 

i 

(5.2a)  x. 


‘*+1  ’  *  £> 


(5.2b) 

(5.2c) 

(S.2d) 


ri+l  "  f  "  ^i+l 
(Ari+1,APi) 
®i  "  "  (APl,Ai‘) 


*i+l  *  ri+l  +  Vi 
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where  the  scalars  {aj^lj^  are  computed  so  that  I  lri+1l  i*  minimized. 
This  requires  the  solution  of  a  syaaetric  system  of  equations  of  order  i+1 
B|(i)  -  l. 

where  Bjt  ■  (Ap#,Apt)  and  g#  “  (ri,Ap#).  Thus,  the  solution  update  is  aore 
complicated  than  in  OCX,  hut  the  coaputation  of  a  set  of  linearly 

independent  direction  vectors  is  simpler.  Although  the  direction  vectors 

T  (  i— 1 )  i 

are  not  all  A  A-orthogonal,  (5. 2d)  and  the  choice  of  {aj  '  )j=Q  *orce 

I l^i I I2  “  "in  I lq^(A)rg I I2 

qi  8  Pi 

to  be  satisfied,  so  that  this  aethod  is  equivalent  to  GCR. 

If  these  aethods  are  restarted  every  k+1  steps,  then  the  resulting 
aethods  are  equivalent  to  GCX(k) .  Both  aethods  can  also  be  aodified  to 
produce  aethods  analogous  to  Orthoain(k) :  only  the  k  previous  vectors 

^Pj)j*i~k+1  *r®  u8#d  in  *nd  oal7  the  k  vectors  tPj)j=i-k+l  are  ®*ed 

in  (S.2a),  with  (aj computed  to  ainiaiae  Ilri+ill2*  However, 
both  these  truncated  aethods  aay  fail  to  converge  in  soae  cases.  (We  have 
encountered  situations  in  which  such  failure  occurs  for  the  truncated 
version  of  (3.1);  see  [2]  for  a  discussion  of  the  truncated  version  of 
(5.2).)  For  this  reason,  we  favor  the  formulation  of  OCX  given  in 
Section  2. 

In  discussing  the  aethods  of  this  paper,  we  have  emphasized  their 
variational  property,  i.e.,  that  x^  is  such  that  1 1 r ^ 1 1 2  i*  minimized  over 
soae  subspace.  Saad  [14,  13]  has  developed  a  class  of  CG-like  aethods  for 
nonsyaaotric  probleas  by  restricting  his  attention  to  the  properties  of 
projection  and  orthogonality.  Let  [▼j)j-Q  “d  Cwj J two  sets  of 
linearly  independent  veotors,  and  let  <vq,...,v^>  and 
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Li  : “  <wQ . wA>.  Seed  defines  an  oblique  projection  method  as  one  that 

computes  an  approximate  solution  i  whose  residual  ri+1  is 
orthogonal  to  For  example,  GCR  is  such  a  method  with  ^  «  <Pq....»Pj> 
and  Lj  =  <ApQ,...,Api>. 


Saad  presents  several  oblique  projection  methods  in  [14,  IS].  One  of 


these  is  in  some  sense  an  alternative  formulation  of  GCR.  Let 

Vq  =  1 1  r°  |  |~»  *®d  let  {-v^ }  t=l  by 

10  2  t 


(5.3) 


Vl.tVl  '  Avt  '  Vi 


where  [h^i^g  ar®  chosen  so  that 


(vt+i »AVj )  “0  i  j  i  t  , 

and  ht+1  t  is  chosen  so  that  Ilvt+1ll2  “  Let  be  the  solution  of 

the  system  of  equations 

(5.4)  HiA(i)  -  H*0II2  (1,0 . 0)T  , 


where  Hi  is  the  upper-Hessenberg  matrix  whose  nonxero  elements  are  the  h^t 
defined  above,  and  let 

i 

(5.5)  x , 


T  (i) 

■*«  '» 


By  construction,  x^  s  xQ  +  where  ^  <vQ, . . . ,Vi>=<v0,Av0, 


(Aiv0>. 


It  can  be  shown  that  v^  is  proportional  to  r^,  *°  *bat  ri+1  i* 
orthogonal  to  Li  :■  <AvQ, . . .  .Avj) .  It  can  also  be  shown  that  xi+1 
minimises  llr^llj  over  Xq  +  <Vq ,AvQ, . . .  ,AivQ>,  so  that  xi+1  is  equal  to 


the  (i+l)’st  iterate  generated  by  GCR 
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Note  that  the  approxiaate  eolation  xi+1  it  computed  only  after  {vt)t«0 

have  been  oomputed,  so  that  this  method  lends  itself  naturally  to 

restarting.  Several  other  heuristics  can  be  used  to  cut  expenses 

(see  [14,  15]).  In  particular,  the  computation  of  the  (vt)  can  be 

truncated,  so  that  at  most  k  vectors  are  used  to  compute  v^+^: 

t 


(5.6) 


h  v 
t+1 , tTt+l 


Avt  "  -  Ntv* 

1  j=max(0,t-k+l)  J  J 


This  procedure  can  then  be  integrated  into  an  algorithm  with  restarts  every 
i+1  steps,  for  i  >  k.  After  {vt)*=Q  have  been  computed  by  (5.6),  xi+^  i» 
computed  as  in  (5.4)  and  (5.5),  and  the  algorithm  is  restarted.  The  effect 
of  truncating  the  cosiputation  of  the  (vt)  is  to  make  a  banded 
upper^-Hessenberg  matrix  with  bandwidth  k.  Ve  do  not  know  when  this  method 
converges. 
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